Not…Until across European Languages: A Parallel Corpus Study

Abstract

We present a parallel corpus study on the expression of the temporal construction ‘not…until’ in a sample of European languages. We use data from Europarl and create semantic maps by multidimensional scaling, in order to analyze cross-linguistic and language-internal variation. This paper builds on formal and typological work, extending it by including conditional constructions, as well as connectives of the type as long as. In an investigation of 7 languages, we find that (i) languages use many more different constructions to convey this meaning than was expected from the literature; and (ii) the combination of polarity marking (negation/assertion) strongly correlates with the type of connective. We corroborate our results in a larger sample of 21 languages. An analysis of clusters and dimensions of the semantic maps based on the enlarged dataset shows that connectives are not randomly distributed across the semantic space of the ‘not…until’-domain.
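The abstract mentions semantic maps created by multidimensional scaling. As a rough illustration only, the following Python sketch shows how such a map could be computed in principle. It assumes classical (Torgerson) MDS and a simple Hamming-style distance between parallel contexts; neither choice is confirmed by the abstract, and the function names and the toy labels are hypothetical, invented for this example rather than taken from the paper's data.

```python
import numpy as np

def hamming_distances(labels):
    """Distance between two contexts = share of languages whose labels differ.

    `labels` is a list of rows, one per parallel context; each row holds one
    construction label per language (a hypothetical coding scheme).
    """
    n = len(labels)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            differing = sum(a != b for a, b in zip(labels[i], labels[j]))
            dist[i, j] = differing / len(labels[i])
    return dist

def classical_mds(dist, dims=2):
    """Classical (Torgerson) MDS: double-center squared distances, then eigendecompose."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]       # largest eigenvalues first
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)

# Made-up toy data: 4 contexts x 3 languages (not from Europarl).
toy_labels = [
    ["until", "bis", "tot"],
    ["until", "bis", "tot"],
    ["before", "bevor", "voordat"],
    ["as_long_as", "solange", "zolang"],
]
dist = hamming_distances(toy_labels)
coords = classical_mds(dist)  # one 2-D point per context
```

Contexts that are translated with the same constructions across languages end up close together in `coords`, which is the intuition behind reading clusters off a semantic map.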

Similar resources

Computational and Linguistic Issues in Designing a Syntactically Annotated Parallel Corpus of Indo-European Languages

This paper reports on the development of the PROIEL parallel corpus of New Testament texts, which contains the Greek original of the New Testament and its earliest Indo-European translations, into Latin, Gothic, Old Church Slavic and Classical Armenian. A web application has been constructed specifically for the purpose of annotating the texts at multiple levels: morphology, syntax, alignment at...

A Parallel Corpus for Evaluating Machine Translation between Arabic and European Languages

We present Arab-Acquis, a large publicly available dataset for evaluating machine translation between 22 European languages and Arabic. Arab-Acquis consists of over 12,000 sentences from the JRC-Acquis (Acquis Communautaire) corpus translated twice by professional translators, once from English and once from French, and totaling over 600,000 words. The corpus follows previous data splits in the ...

Alignment Across Oriental and Indo-European Languages

The linguistic characteristics of Oriental languages and Indo-European languages are very different. A purely length-based algorithm cannot achieve high performance when aligning such texts. This paper investigates the effectiveness of a critical part-of-speech (POS) criterion for alignment under different search strategies and with texts of different registers. Two metrics, recall and precis...

A massively parallel corpus: the Bible in 100 languages

We describe the creation of a massively parallel corpus based on 100 translations of the Bible. We discuss some of the difficulties in acquiring and processing the raw material as well as the potential of the Bible as a corpus for natural language processing. Finally we present a statistical analysis of the corpora collected and a detailed comparison between the English translation and other En...

Journal

Journal title: Languages

Year: 2022

ISSN: 2226-471X

DOI: https://doi.org/10.3390/languages7010056